PREFACE


This Memorandum is written to supplement the statistical training
of maintenance personnel in the Reports and Analysis section at the
base level. (It should not be regarded as an introductory text in
statistics.) Since it is intended as an adjunct to self-tutoring, it
is written in language that the average NCO of the analysis section
should find familiar and comprehensible. At critical points,
statistical notation is explained so the reader can supplement his
background by studying standard texts in statistics.

While making preliminary investigations for RAND research concerned
with maintenance management, the author worked with the 15th
Air Force in the analysis of the SAC Full-Force Project. It was
discovered that the Reports and Analysis (R&A) personnel had not been
exposed to certain statistical methods that are crucial in answering
a number of questions. These methods, involving the use of chi-square
and analysis of variance, are not presently taught in the R&A training
program, even though they will handle a majority of the statistical
problems usually arising at base level. It was further discovered
that R&A personnel could readily learn these methods and just as
readily put them to excellent use; their deep knowledge of the
maintenance system made the tutoring process easy.

Incorporating the suggestions of several participants, this study
formalizes the tutoring effort that resulted from the Full-Force
experience. The Memorandum amounts to a simple translation of an
elementary statistics text into the context of weapon-system analysis.
Although most of the examples are taken from SACR 66-7 information,
the methods are equally applicable to data generated by other major
air commands.

This study, then, should be of interest to all personnel responsible
for evaluating the effects of maintenance at any air base. It
should also prove useful in resolving many of the analysis problems
that arise at headquarters level.


SUMMARY 


This Memorandum explains the use of two statistical tools,
chi-square and analysis of variance, in the context of maintenance
problems that arise at air bases.

These tools are particularly useful for examining the familiar
day-to-day, month-to-month variation in maintenance numbers, the
crucial question being whether given variations are the result of
normal random fluctuation, or whether they are abnormal -- in which
case they probably merit the expenditure of maintenance resources
for corrective action.

Since most maintenance measures are frequency counts, the emphasis
is on methods for dealing with this kind of data. Thus, following
a brief introduction, the Memorandum asks a typical maintenance
question: "How is alert affecting my break-rate?" The answers show
how the chi-square test may be used. The method is elaborated with
several examples.

Following this, two other maintenance questions demonstrate how
simple analysis of variance may be used: "What effect is the
100-hour periodic having on subsequent missions?" and, "Do we have any
bomb-nav systems going sour?" This Section includes several
suggestions for easing the burden of computation. These suggestions are
expanded in Appendix A, which shows how to use the PCAM to reduce
the clerical effort.


I. INTRODUCTION


Only  one  thing  is  certain  about  maintenance  numbers:  they 
vary.  Regardless  of  what  the  break-rate  was  last  month,  this  month 
it  will  be  either  up  or  down.  This  variation  is  characteristic  of 
stochastic  phenomena,  and  it  is  at  the  heart  of  our  problem.  We 
know  the  numbers  will  change  as  a  part  of  the  normal  fluctuation 
(i.e.,  random  variation).  What  we  need  is  a  method  for  determining 
whether  the  change  has  exceeded  the  bounds  of  normal  fluctuation. 

As  one  Deputy  Commander  for  Materiel  put  it:  "I  know  there  has 
been  a  change;  what  I  want  to  know  is,  should  I  worry  about  it?" 

How  to  answer  his  question  is  the  objective  of  this  study. 

For example, our first illustration will be concerned with
determining whether a two-week stay in ground alert has had a
deteriorating effect on an aircraft. The numbers show that the
first sorties after alert produced more Form 126 write-ups per
sortie than other sorties did.

The  question  is:  Should  we  attribute  this  increase  to  normal 
variation,  or  to  the  effects  of  the  stay  in  alert?  If  the  increase 
is  only  random  variation,  we  will  not  want  to  waste  manpower  on 
special measures. Corrective action is the proper course if (and
only if) the difference is significant -- that is, abnormal.
Statistical techniques enable us to answer that question and thus
conserve our maintenance resources for dealing with abnormal variation.

The following introduces Air Force maintenance personnel to
chi-square and analysis of variance, and shows how these methods
can be helpful in running the maintenance show.


II. CHI-SQUARE: THE ONE-WAY CASE

Consider the following questions: "How does a long period
of alert status affect an aircraft?" "Is alert degrading the
potential EWO (Emergency War Order) effectiveness?" "If so, what
systems, if any, are most vulnerable?" "Can any steps be taken to
prevent the degradation?"

To begin answering these questions and, incidentally, to begin
learning a little about the use of chi-square (χ², pronounced
"ky-square"), the following numbers were collected:

1. Two samples of the number of write-ups on SAC 126 forms, one
of 39 and one of 43 B-47 training sorties (204 and 221 write-ups,
respectively).

2. A third sample of 124 write-ups produced by 18 sorties
flown immediately after the aircraft came off alert.

3. Three similar samples produced by rolling dice. If we
use dice, we can be sure that the resulting variation
is due to randomness. (In order to get numbers resembling
the actual maintenance data, three dice were used and
three was subtracted from each throw. Our highest throw
was 16, so the highest number in the table is 13.)
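
The dice procedure in item 3 is easy to imitate. The following is a minimal Python sketch, assuming fair six-sided dice; the function name and the seed are illustrative choices, and the numbers it prints will not match the Memorandum's own throws.

```python
import random

def simulated_write_ups(n_sorties, rng):
    """Roll three dice per 'sortie' and subtract three, as the text describes."""
    return [sum(rng.randint(1, 6) for _ in range(3)) - 3 for _ in range(n_sorties)]

rng = random.Random(1962)        # arbitrary seed, used only for repeatability
sample_sizes = [43, 39, 18]      # same sizes as the three aircraft samples
dice_samples = [simulated_write_ups(n, rng) for n in sample_sizes]

for n, sample in zip(sample_sizes, dice_samples):
    mean = sum(sample) / n
    print(f"{n:2d} rolls: total {sum(sample):3d}, mean {mean:.2f}")
```

Each simulated value lies between 0 and 15, and the sample means differ from run to run purely by chance, which is the point of the comparison.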

The numbers themselves (Table 1) do not show us any differences,
either between samples or between a column of sortie write-ups and
the adjacent column of dice rolls. If they were not labeled, we
could not tell which set was which. Therefore, the conventional
thing to do is to find the means (simple averages) of the numbers.
Table 2 lists the average write-ups per sortie for the samples
shown in Table 1.


                               Table 1

        A COMPARISON OF ACTUAL WRITE-UPS OF TRAINING SORTIES
               WITH "WRITE-UPS" PRODUCED BY DICE ROLLS

                         Number of Write-Ups

      Sample 1                 Sample 2                 Sample 3
 43 Dice   43 Regular     39 Dice   39 Regular     18 Dice   18 Post-Alert
  Rolls     Training       Rolls     Training       Rolls      Training
            Sorties                  Sorties                   Sorties

    13          12            ?          13           11           13
    11          11            ?           9           11           12
    10          10           10           9           11           12
     9          10            9           8           10           11
     9          10            9           8           10            9
     9           8            9           8            7            8
     8           8            9           8            7            7
     8           7            9           7            6            7
     8           7            9           7            5            7
     7           7            8           7            5            7
     7           7            8           7            5            6
     7           7            8           7            4            6
     7           7            7           7            4            4
     6           7            7           6            4            4
     6           7            7           6            4            3
     6           6            6           6            2            3
     6           6            6           6            1            3
     6           6            6           5            0            2
     6           6            6           5
     5           6            6           5
     5           5            5           5
     5           5            5           5
     5           5            5           4
     5           5            5           4
     5           5            4           4
     5           4            4           4
     4           4            4           4
     4           4            4           4
     4           4            4           4
     4           3            4           4
     4           3            4           3
     4           3            3           3
     3           3            3           3
     2           3            3           2
     2           2            3           2
     2           2            3           2
     2           2            2           1
     2           1            2           1
     1           1            1           1
     1           1
     1           1
     0           0
     0           0

NOTE: Entries marked ? are illegible in the source.


                 Table 2

       ARITHMETIC MEANS OF THE DATA
            SHOWN IN TABLE 1

                Write-Ups/Sortie
   Sample       Dice     Aircraft
      1         5.21       5.14
      2         5.90       5.23
      3         5.94       6.89


In Table 2, we have attempted to capture the essence of this
Memorandum's message. Both sets of samples show differing means.
We know the differences are random in the dice-produced samples.
With the aircraft samples we are not sure. Although the post-alert
sample differs from the two regular-flyer samples, we do not know
whether the difference is within normal bounds. Our quandary is
only heightened by Table 3, which regroups the data in Table 1 to
show the frequency with which sorties resulted in 13 write-ups, 12,
11, etc., down to 0. Thus Table 3 suggests that it would be prudent
to "wait one" before going gung-ho on the assumption that alert
aircraft are yielding more write-ups. While it is true that aircraft
just off alert show more write-ups, on an average, than do the regular
flyers, this difference apparently is all but lost in the variability:
the distributions of the flyers scatter just like the distributions
of the dice (i.e., from 0 to 13 write-ups). The question arises: "Is
the scatter among the flyers random (as is the scatter of the dice),
or is there a difference: is there reason to presume that the scatter
among the flyers may not be random?"


                          Table 3

        FREQUENCY DISTRIBUTIONS OF THE DATA IN TABLE 1

                            Frequency

 Number of        Dice Sample            Aircraft Sample
 Write-Ups      1      2      3          1      2      3

    13          1      ?      0          0      1      1
    12          0      ?      0          1      0      2
    11          1      ?      3          1      0      1
    10          1      ?      2          3      0      0
     9          3      6      0          0      2      1
     8          3      3      0          2      4      1
     7          4      3      2          8      6      4
     6          6      5      1          5      4      2
     5          7      4      3          5      5      0
     4          6      7      4          4      8      2
     3          1      5      0          5      3      3
     2          5      2      1          3      3      1
     1          3      1      1          4      3      0
     0          2      0      1          2      0      0

     Σ        224    230    107        221    204    124
     N         43     39     18         43     39     18
     M       5.21   5.90   5.94       5.14   5.23   6.89

NOTE: In Aircraft Sample 3, one sortie yielded 13 write-ups, 2 sorties
yielded 12, one sortie yielded 11, etc. Σ = total write-ups; N = number
of sorties; M = mean. Entries marked ? are illegible in the source.

The  chi-square  test  is  one  method  of  checking  for  randomness,  i.e., 
deciding  whether  differences  among  flyers  can  be  attributed  to  chance. 
The  following  calculations  are  for  the  dice  samples  (where  we  know 
the  answer) : 


  (1)        (2)        (3)      (4)        (5)           (6)             (7)
          Base Line                                                  (F_o - F_t)^2
Sample   ("Sorties")    F_o      F_t     F_o - F_t   (F_o - F_t)^2       / F_t

   1         43         224     241.2      -17.2         295.8           1.23
   2         39         230     218.8      +11.2         125.4           0.57
   3         18         107     101.0       +6.0          36.0           0.36

 Total      100         561     561.0                               χ² = 2.16

NOTE: F_o stands for observed frequency, F_t for theoretical
frequency; χ² = chi-square = 2.16.


Details of the calculations follow. The first 43 "sorties" produced
224 write-ups out of the total of 561. Theoretically, since
there were 100 sorties, we would have expected the 43 sorties to
produce (43/100) x (561) = 241.2 write-ups. Similarly:

(39/100) x (561) = 218.8
(18/100) x (561) = 101.0

If the computation is correct, the sums of the observed (F_o)
and theoretical (F_t) frequencies will agree within rounding error (561
and 561.0).

Column 5 tabulates the difference between F_o and F_t: 224 - 241.2 =
-17.2. This difference is squared in Col. 6: (-17.2)^2 = 295.8, which
is then divided by F_t to give Col. 7: 295.8/241.2 = 1.23. All the
items in Col. 7 are added to get the chi-square: 1.23 + 0.57 + 0.36 =
2.16.
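
The column-by-column arithmetic above can be sketched in a few lines of Python. This is an illustration only: the function name `chi_square_one_way` and its argument names are assumptions, not part of the Memorandum, and the code carries full precision rather than rounding F_t to one decimal.

```python
def chi_square_one_way(base_line, observed):
    """Return (expected, contributions, chi_square) for a one-way test.

    base_line -- exposure per category (here, sorties flown)
    observed  -- observed counts per category (here, write-ups)
    """
    total_base = sum(base_line)
    total_obs = sum(observed)
    # Col. 4: apportion the total write-ups in proportion to sorties flown.
    expected = [b / total_base * total_obs for b in base_line]
    # Col. 7: squared difference divided by the theoretical frequency.
    contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
    return expected, contributions, sum(contributions)

sorties = [43, 39, 18]
dice_write_ups = [224, 230, 107]
expected, contributions, chi_sq = chi_square_one_way(sorties, dice_write_ups)

for s, o, e, c in zip(sorties, dice_write_ups, expected, contributions):
    print(f"{s:3d} {o:4d} {e:7.1f} {o - e:+7.1f} {c:5.2f}")
print(f"chi-square = {chi_sq:.2f}, df = {len(sorties) - 1}")
```

The same function can be reused for any set of categories, provided the base line (sorties, items scheduled, and so on) is supplied alongside the observed counts.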

To interpret the significance of the chi-square we need one
additional bit of information: the degrees of freedom (df). For
the one-sample chi-square method, the df is always one less than the
number of categories. In this instance, since we are working with
three categories (samples),

df = 3 - 1 = 2.

We then enter a table of significance levels for the chi-square
distribution, at the row for df = 2:


Critical  Values  of  Chi-Square 


NOTE:  See  chi-square  tables,  Appendix  B.  Dots  here 

indicate  data  that  are  irrelevant  for  our  immediate  use. 

Each  line  differs  according  to  df,  and  in  our  case  of  2  df 
the  second  line  in  the  table  applies. 

The numbers across the top of the table (.99, . . ., .50, . . .)
are called probability levels. These are the probabilities that a
chi-square will be higher than the critical value shown down its
column at the applicable df line. The probability level which is
selected for a particular chi-square test is called the level of
significance; all computed chi-square values greater than this critical
value fall into the region of rejection. For example, the critical
value for the .05 significance level is 5.99, and all computed
chi-square values for 2 df higher than 5.99 fall into the region of
rejection for this particular chi-square test.

Our chi-square, 2.16, is between the critical values associated
with the .50 and .30 probability levels for 2 df. This means that of
all computed chi-squares based on sets of rolls of dice (where each
set consists of 43, 39, and 18 rolls respectively), between 30 and
50 of every 100 such sets would show a higher computed chi-square value.
We can therefore say that chance accounts for the variation we found,
since it is customary to reject the hypothesis of homogeneity -- i.e.,
random variation -- only if the computed value of χ² falls beyond the
.05 significance level for the appropriate df. This choice of .05 is
arbitrary; .01 could have been used and often is.
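
For readers without the table at hand, the 2 df row can be checked directly. With 2 df (and only 2 df) the probability of exceeding a chi-square value x is exp(-x/2), so a critical value is -2 ln(level). The sketch below relies on that closed form; the function names are illustrative.

```python
import math

def chi_square_tail_2df(chi_sq):
    """Probability that a chi-square with 2 df exceeds chi_sq."""
    return math.exp(-chi_sq / 2.0)

def critical_value_2df(level):
    """Chi-square value exceeded with probability `level` when df = 2."""
    return -2.0 * math.log(level)

p = chi_square_tail_2df(2.16)
print(f"P(chi-square > 2.16) = {p:.2f}")
for level in (0.50, 0.30, 0.05, 0.01):
    print(f"level {level:.2f}: critical value {critical_value_2df(level):.2f}")
```

The computed probability falls between .30 and .50, in agreement with the table lookup described above. Other df values need a different formula or the printed table.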

To recapitulate:

1. We collected some data: three samples of "write-ups."

2. We determined a theoretical distribution by apportioning our
observations in terms of sorties flown.

3. We worked some simple arithmetic based on the formula:

$$\chi^2 = \sum \frac{(F_o - F_t)^2}{F_t}$$

(The Greek capital sigma (Σ) indicates a summation,
i.e., that the numbers following should be added up.)

4. Finally, we checked the significance of our chi-square
with a table of chi-square probabilities to see if the
numbers were getting out of the range of normal variation.

Now  that  we  have  confirmed  that  the  variations  in  the  three 
samples  of  dice  results  are,  in  fact,  random,  let  us  look  at  the 
B-47  data  of  Table  1. 


                     Write-Ups                                    (F_o - F_t)^2
Samples   Sorties      (F_o)      F_t    F_o - F_t  (F_o - F_t)^2     / F_t

   1        43          221      236.1     -15.1         228          0.96
   2        39          204      214.1     -10.1         102          0.48
   3        18          124       98.8     +25.2         635          6.42

 Total     100          549      549.0                           χ² = 7.86
                                                                  df = 2

NOTE: The computations are the same as before:

43/100 x 549 = 236.1
39/100 x 549 = 214.1
18/100 x 549 = 98.8
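
The B-47 table can be reproduced with the same arithmetic. The sketch below is illustrative only; it uses unrounded theoretical frequencies and, because df = 2 here, the closed form exp(-χ²/2) for the probability of a chi-square at least this large.

```python
import math

sorties = [43, 39, 18]          # regular, regular, post-alert
write_ups = [221, 204, 124]     # observed frequencies (F_o)

total_sorties = sum(sorties)
total_write_ups = sum(write_ups)
expected = [s / total_sorties * total_write_ups for s in sorties]   # F_t
contributions = [(o - e) ** 2 / e for o, e in zip(write_ups, expected)]
chi_sq = sum(contributions)

# For df = 2 only, the upper-tail probability is exp(-chi_sq / 2).
p_value = math.exp(-chi_sq / 2.0)
largest = contributions.index(max(contributions)) + 1

print(f"chi-square = {chi_sq:.2f}, P = {p_value:.3f}")
print(f"largest contribution comes from sample {largest}")
```

The third (post-alert) sample supplies most of the chi-square, and the probability comes out near 2 in 100.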

Entering the χ² table, we find (for 2 df):

                 Probabilities

  df        .05        .02        .01

   1       . . .      . . .      . . .
   2       5.99       7.82       9.21
   3       . . .      . . .      . . .

There are only 2 chances in 100 of getting a χ² as large as we
got (7.86). Consequently, we strongly suspect that something other
than chance has been at work. Checking the computation, we find that
the biggest contributor to the χ² is the third (Alert) sample and
that the observed frequency is much greater than the expected (or
theoretical) frequency. The aircraft are affected (the dice are
"loaded"), apparently, by their stay in alert.

Before  we  consider  a  more  elegant  problem,  some  comment  on  chi- 
square  analysis  is  in  order. 

1.  Because  the  numbers  are  easy  to  get  (simple  sorting  and 
tabulation),  and  the  computations  simple,  chi-square  is 
an  exceedingly  economical  tool. 

2. Chi-square does not "prove" or disprove anything. It
only indicates whether the numbers can be explained by
chance or not. Note: the chi-square did not prove that
Alert status caused malfunctions; it only indicated that
the differences among the numbers were probably not due
to chance. The difference could result from dirty data
or faulty sampling; it might even be that flight crews
are more critical about aircraft just off alert and
therefore write up more things. What chi-square has
done, rather, is suggest a profitable area for further
exploration (i.e., the effects of standing in alert).
This can be of considerable importance to the
short-staffed maintenance group, who must limit their efforts
to areas with potential payoff.

3. One cannot get into too much trouble if one remembers that:

a. The sums of F_o and F_t must agree (within rounding
errors).

b. The F_t should be greater than 5 in each cell (preferably
greater than 10). (This is a requirement to meet certain
mathematical conditions on which this procedure is based.)

c. Chi-square only interprets the data given to it.

4. In the statistics books the formula is generally written:

$$\chi^2 = \sum \frac{(F_o - F_t)^2}{F_t}$$

and also:

$$\chi^2 = \sum \frac{(o - e)^2}{e}$$

in which "e" represents "expected or theoretical frequency",
and "o" represents "observed frequency."
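
The two arithmetic safeguards in comment 3 (matching sums, and theoretical frequencies greater than 5) lend themselves to an automatic check. The following is a minimal sketch; the function name, the `tolerance` default, and the second set of numbers are hypothetical and are included only to show a warning being raised.

```python
def check_rules_of_thumb(observed, expected, minimum=5.0, tolerance=0.5):
    """Return a list of warnings for the two arithmetic checks in comment 3.

    `minimum` follows the "greater than 5" guideline; `tolerance` is an
    illustrative allowance for rounding error, not a value from the text.
    """
    warnings = []
    if abs(sum(observed) - sum(expected)) > tolerance:
        warnings.append("sums of F_o and F_t disagree")
    for i, e in enumerate(expected, start=1):
        if e <= minimum:
            warnings.append(f"cell {i}: F_t = {e:.1f} is not greater than {minimum:g}")
    return warnings

dice_check = check_rules_of_thumb([224, 230, 107], [241.2, 218.8, 101.0])
small_cell_check = check_rules_of_thumb([12, 3], [11.0, 4.0])  # made-up numbers
print(dice_check)
print(small_cell_check)
```

The dice data pass both checks; the made-up second case is flagged because one theoretical frequency is below 5.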

By  way  of  review  of  the  chi-square  exercise,  let  us  take  another 
conventional  maintenance  question:  Do  we  have  any  "dogs"  or  "hangar 
queens"?  From  last  week's  maintenance  data  collection  (MDC)  reports, 
we  isolate  the  information  shown  in  the  table  below: 


Tail Number     Work Units

    248              6
    135             16
    426             21
    186             22
    261             32
    285             36
    403             37
    187             51
    210             58

(The table also carries a Man-Hours column, whose figures are not
legible in the source.)

From the looks of this, the big troublemakers are Tail Numbers
187 and 210, while the friends of maintenance are 248, 135, 426, and
186. But let us look at these same numbers statistically and in
terms of numbers of sorties, as we did before. Here is the chi-square
computation for work units:

 Tail     Sorties    Work-Units              (F_o - F_t)^2    Means (Observed
Number     Flown       (F_o)        F_t          / F_t       Work-Units/Sortie)

  248        2            6        16.91          7.04            3.00 (-)
  135        1           16         8.45          6.73           16.00
  426        3           21        25.36          0.75            7.00 (-)
  186        3           22        25.36          0.45            7.33 (-)
  261        5           32        42.27          2.50            6.40 (-)
  285        4           36        33.82          0.14            9.00
  403        4           37        33.82          0.30            9.25
  187        3           51        25.36         25.91           17.00
  210        8           58        67.64          1.37            7.25 (-)

 Total      33          279       278.99      χ² = 45.19       GM = 8.45

NOTE: The minus signs in the "Means" column tag
those aircraft that are below the fleet grand mean (GM) of 8.45
observed work-units per sortie flown.


Examination of the "critical values of chi-square" table, Appendix
B, shows that our χ² = 45.19 for 8 df is far beyond the .001 level.
This certainly is not random variation. A close examination of the
computations reveals an entirely different picture from that in the
table of MDC data above. Tail Number 210, rather than being the "dog,"
turns out to be a "good bird." As a matter of fact, it has produced
a below-average number of work units per sortie flown. The real problem
is Tail Number 187. Besides producing double the average number of
work units, its contribution to the chi-square summation is far beyond
bounds: the .05 level for 8 df is 15.51 and No. 187 alone contributes
25.91. Number 135 is the other aircraft that is far above the mean.
We will want to keep an eye on it: if it keeps up at the rate it is
going, it will be in the same class as No. 187.
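
The "hangar queen" computation can be sketched as follows. The data are the sorties and work units from the table above; the variable names are illustrative, and the code carries full precision, so individual contributions may differ from the table in the last decimal.

```python
fleet = {  # tail number: (sorties flown, work units)
    248: (2, 6), 135: (1, 16), 426: (3, 21), 186: (3, 22), 261: (5, 32),
    285: (4, 36), 403: (4, 37), 187: (3, 51), 210: (8, 58),
}

total_sorties = sum(s for s, _ in fleet.values())
total_units = sum(u for _, u in fleet.values())
grand_mean = total_units / total_sorties        # work units per sortie

rows = []
for tail, (flown, units) in fleet.items():
    expected = flown * grand_mean                # F_t, apportioned by sorties
    contribution = (units - expected) ** 2 / expected
    rows.append((tail, units / flown, contribution))

chi_sq = sum(c for _, _, c in rows)
worst_tail = max(rows, key=lambda r: r[2])[0]

for tail, mean, contribution in rows:
    flag = "(-)" if mean < grand_mean else ""
    print(f"{tail}: mean {mean:5.2f} {flag:3s} contribution {contribution:5.2f}")
print(f"chi-square = {chi_sq:.2f}, df = {len(fleet) - 1}, GM = {grand_mean:.2f}")
```

Ranking by contribution rather than by raw work units is what moves Tail Number 210 out of the "dog" category: it flew the most sorties, so a large raw count is expected of it.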

The examples thus far have been simplified to ease the learning
process. Let us look now at a typical real-life computation.

The samples of write-ups used for the first illustration were
taken from a larger sample obtained from one of the SAC B-47 bases
(15th AF). The complete set of data and computations are:

Fly Cycle    Sortie       F_o                                Mean Write-Ups
 Number      Count     (Write-Ups)      F_t      D^2/F_t       per Sortie

  1st          43         221         235.25       0.86         5.14 (-)
  2d           45         262         246.20       1.01         5.82
  3d           46         210         251.67       6.90         4.57 (-)
  4th          39         204         213.37       0.41         5.23 (-)
  5th          29         143         158.66       1.55         4.90 (-)
  6th          23         135         125.83       0.67         5.87
  AA           18         124          98.48       6.61         6.89
  AP           16         118          87.54      10.60         7.38

            Σ = 259    Σ = 1417      1417.00    χ² = 28.62    GM = 5.47
                                                 df = 7

NOTE: The cycle numbers refer to the monthly flying cycles;
i.e., the 1st, 2d, 3d time an aircraft flew during a month. AA is
the first sortie flown after alert. AP is the first sortie after a
periodic. The AA and AP counts are not included in the other counts.
D^2/F_t represents the differences (between F_o and F_t) squared,
divided by F_t.


The table of chi-squares (see the 7 df row) shows that the
computed value of χ² is beyond the 0.001 significance level. Thus,
we reject the hypothesis that the differences among the eight measures
are due to chance variation: there is less than one chance in 1000
of getting a chi-square this large.
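
When a printed table is not available, the tail probability for any whole-number df can be computed from standard finite-series forms of the chi-square distribution. The sketch below is one such implementation using only the Python standard library; the function name `chi_square_tail` is an assumption, and the result should be treated as a cross-check on Appendix B rather than a replacement for it.

```python
import math

def chi_square_tail(chi_sq, df):
    """Probability that a chi-square with `df` degrees of freedom exceeds chi_sq.

    Uses the finite-series forms that hold for whole-number df.
    """
    if df < 1 or chi_sq < 0:
        raise ValueError("df must be >= 1 and chi_sq must be >= 0")
    half = chi_sq / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum of (x/2)^j / j! for j = 0 .. df/2 - 1
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    # Odd df: normal-tail term plus a finite correction series
    tail = math.erfc(math.sqrt(half))
    term, total = 1.0, 0.0
    for j in range(1, (df - 1) // 2 + 1):
        if j > 1:
            term *= chi_sq / (2 * j - 1)
        total += term
    return tail + math.sqrt(2.0 * chi_sq / math.pi) * math.exp(-half) * total

sorties = [43, 45, 46, 39, 29, 23, 18, 16]
write_ups = [221, 262, 210, 204, 143, 135, 124, 118]
grand_mean = sum(write_ups) / sum(sorties)
chi_sq = sum((o - s * grand_mean) ** 2 / (s * grand_mean)
             for s, o in zip(sorties, write_ups))
df = len(sorties) - 1
print(f"chi-square = {chi_sq:.2f}, df = {df}, P = {chi_square_tail(chi_sq, df):.5f}")
```

For the eight fly-cycle samples the probability is well under one in 1000, matching the table lookup.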

The mean write-ups per sortie are included to further the
comparison (e.g., the first sample consisted of 43 sorties, producing
221 write-ups, for an average of 221/43 = 5.14 write-ups per sortie).
The entire study involved 259 sorties producing 1417 write-ups for a
grand mean of 5.47 (1417/259 = 5.47). Again, the minus signs tag
those sample means that are smaller than the grand mean.

Checking the χ² column, we note that the 3d cycle, AA, and
AP are the biggest contributors to the chi-square. Cross-checking
with the Means column, we find that the mean for the 3d cycle is
smaller than the grand mean, and those for AA and AP are much larger.
Thus, we get a clearer picture of the numbers: the chi-square indicated
that the difference among the eight samples was not due to chance;
and by cross-checking with the chi-square and Means columns, we begin
to get some hunches about which sets of data are abnormal. We have
an unanswered question: "Is something peculiar about Cycle 3?" It
produces far fewer write-ups than expected. We would like to know
what caused that.

Note  well,  however,  that  the  chi-square  test  told  us  only  that 
the  differences  among  all  eight  samples  were  due  to  something  other 
than  chance.  We  cannot  say,  specifically,  that  Cycles  3,  AA,  and 
AP  are  the  loaded  dice,  but  they  are  good  candidates  for  a  more 
detailed  investigation. 

Our conclusion, based on the test, is that there is strong reason
to suspect that post-alert and post-periodic sorties result in more
write-ups than do the regular flyers. This is an interesting finding,
but an equally interesting one would be the answer to the question:
"Do aircraft just off alert have discrepancies that cause more frequent
loss of mission than do the regular flyers?" We could dig into the
SAC Form 127 and the SAC debriefing forms file for two sets of data
(the "0" and "1" code of Col. 14, which would yield a sample of
regular flyers and AA's). Then we could sum the training items
scheduled (Cols. 17-21) and the training items lost due to
malfunction (Cols. 41-42), and perform the test almost as before.

Before we can calculate the chi-square for these data, we
must learn one more thing. Until now we have obtained frequencies
of write-ups, work units, etc., for each sortie, and then summed
all the frequencies to get the total of frequencies in the category.
In the next few examples, we will discuss an interesting, unique
case where each observation can fall into one of two classes --
success or failure -- and we will get the observed and theoretical
frequency in each class. Data where each observation is grouped
according to success or failure, and the total frequency of successes
and failures for all observations is listed, are called binomial
distribution data. When the data are binomially distributed (as
they will be in the next few examples) the observed and theoretical
frequencies of both successes and failures must be included in the
chi-square computations, as will be shown. Binomial distributions
frequently encountered are:

Heads or tails;

Yes  or  no; 

Broken  or  unbroken; 

Write-ups  or  no  write-ups; 

Zero  or  one; 

Failed  or  passed; 

Successful  or  unsuccessful; 
and  so  on. 

The  computation  for  binomially  distributed  data  is  illustrated. 
Note  that  it  requires  only  slight  further  computation. 


First lay out and compute as before:

                Items         F_o           F_t
Class         Scheduled    Successful    Successful    D^2/F_t

Regular          697          633          622.6         0.17
Post-Alert        62           45           55.4         1.95

Total            759          678          678.0

NOTE: As before, 678/759 x 697 = 622.6, etc.

Then arrive at the nonoccurrence F_o and F_t by subtraction:
697 - 633 = 64, and 62 - 45 = 17.

                Items         F_o            F_t
Class         Scheduled   Unsuccessful   Unsuccessful   D^2/F_t

Regular          697           64           74.4          1.45
Post-Alert        62           17            6.6         16.39

Total            759           81           81.0

Combine D^2/F_t:

0.17 +  1.45 =  1.62
1.95 + 16.39 = 18.34

χ² = 19.96
df = 1
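
The two-step layout above (successes first, then nonoccurrences by subtraction) can be sketched in Python as shown below. The dictionary names are illustrative. Because the code keeps full precision instead of rounding F_t to one decimal, it gives a chi-square of about 19.9 rather than exactly 19.96; the conclusion is unaffected. With 1 df the tail probability is erfc(sqrt(χ²/2)).

```python
import math

scheduled = {"Regular": 697, "Post-Alert": 62}
successful = {"Regular": 633, "Post-Alert": 45}

total_scheduled = sum(scheduled.values())
success_rate = sum(successful.values()) / total_scheduled

chi_sq = 0.0
for group, n in scheduled.items():
    observed_ok = successful[group]
    observed_lost = n - observed_ok              # nonoccurrence by subtraction
    expected_ok = n * success_rate
    expected_lost = n - expected_ok
    # Both the success and the failure cells enter the summation.
    chi_sq += (observed_ok - expected_ok) ** 2 / expected_ok
    chi_sq += (observed_lost - expected_lost) ** 2 / expected_lost

# With 1 df the upper-tail probability is erfc(sqrt(chi_sq / 2)).
p_value = math.erfc(math.sqrt(chi_sq / 2.0))
print(f"chi-square = {chi_sq:.2f}, df = 1, P = {p_value:.6f}")
```

Leaving out the failure cells would reduce the chi-square to roughly 2.1, which illustrates why the nonoccurrence items must be included for binomial data.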


The chi-square is highly significant: a check of the chi-square
table shows that it is well beyond the 0.001 level. Our alert aircraft
are contributing an undue share (17 items lost versus an expected
6.6).

Let us suppose that the findings did not come out so neatly.
Assume that χ² came out 1.72 (instead of 19.96). A check with the
chi-square tables (for 1 df) would show that the probability of
getting a chi-square as large as or larger than that observed lies
between 0.20 and 0.10. In other words, there are about 15 chances
in 100 of getting a χ² this big or bigger -- that is, of having
observations differ from theoretical values by as much as or more than
actually occurred even if regular and post-alert situations are the
same and the differences occurred only by chance. This probability
(0.15) is in the messy, in-between area. One cannot say that
it is low enough (i.e., p = 0.05) to reject a null hypothesis (meaning,
in this case, that the difference is due to chance). Even though
the direction is the same as that of the previous data, we cannot be
sure there is a significant difference. Several alternatives are
possible, however:

First, we can assume there is nothing sacred about an 0.05 or
0.01 probability level, which is a matter of convention. If the
situation is critical -- for example, if it might affect a unit's
showing in an Operational Readiness Inspection (ORI) -- we probably
would take steps to see that things did not get worse.

Second, if the matter is not critical, we might elect to keep
it under close surveillance, checking to see if the data-trend
continues to deteriorate.

Third,  we  might  obtain  an  entirely  different  set  of  numbers 
(say  from  last  quarter,  or  from  some  other  base)  and  run  it  again. 
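
The in-between probability discussed above can be pinned down more closely than the table allows. With 1 df, the chance of exceeding a chi-square value x is erfc(sqrt(x/2)). The sketch below uses that identity; the function name is illustrative. It puts the probability for 1.72 near 0.19, which is still inside the 0.10 to 0.20 band read from the table.

```python
import math

def chi_square_tail_1df(chi_sq):
    """Probability that a chi-square with 1 df exceeds chi_sq."""
    return math.erfc(math.sqrt(chi_sq / 2.0))

for value in (1.72, 3.84, 19.96):
    print(f"chi-square {value:5.2f}: P = {chi_square_tail_1df(value):.4f}")
```

The middle value, 3.84, is the familiar .05 critical value for 1 df and serves as a check on the formula.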

While  we  are  sorting  the  SAC  Form  127,  we  would  probably  wish 
to  pick  up  data  on  three  other  critical  measures  to  determine  whether 
a  stay  in  alert  affects: 

1.  The  number  of  late  takeoffs  due  to  materiel. 

2.  The  number  of  cancellations  due  to  materiel. 

3. The EWO effectiveness.

Each of these is a critical measure that might be influenced by
standing in home alert. The chi-square testing would be identical to
that in the previous example: two categories (1 df) using the "0"
and "1" codes. Since these are all binomial distributions, the
nonoccurrence items must be included in the computation.

The examples have been taken from the 1962 SAC 126 form; the new
SAC form works out even better for SAC analyses. Late takeoffs,
cancellations, first sorties of the day, first sorties after ground
alert, and other data (Cols. 17-31) are processed as described. In
addition, the information from Cols. 32-51 provides an excellent set
of data to monitor for possible "sick" systems. The Code 1 counts
(system used and satisfactory) become the Base-Line, and the 2, 3,
and 4 codes (system used and not satisfactory), combined, become the
observed frequencies (i.e., write-ups).

The new ADC mechanized debriefing form (76-3) lends itself equally
well to statistical analysis. The sortie cards yield the write-up
counts (by type of mission if desired) as well as verification of
armament connections. However, to get intercept effectiveness, some
minor PCAM operation is necessary. The easiest way is to punch the
tail number in Cols. 78-80 on all cards. This enables one to isolate
data by sortie and serial number. With this, some really elegant
analyses are available: which tail-numbers have out-of-bounds kill
rates under what circumstances; what situations produce the worst and
the best kill ratios; what flight crews need special training in which
situations, and so on.

One of the advantages of using chi-square analysis is that it
provides a means of directing the resources of R&A to payoff areas.
Thousands of such investigations could be undertaken by R&A, but
many would end up as complete wastes of effort. With chi-square,
we can make a quick, cheap test to determine if a more elegant
investigation will be profitable. The following is a partial list of
things that might be looked at:

Comparison  of  experienced  and  inexperienced  flight  crews 

Comparison  of  experienced  and  inexperienced  maintenance  personnel 

The  effects  of  long  versus  short  training  missions 

First  25  hours  versus  last  25  hours  since  the  100-hour  periodic 

Yo-Yo versus non-Yo-Yo flights

Low-altitude versus high-altitude sorties

Reciprocating versus jet engines

Winter versus summer

Regular sorties versus special sorties

Morning versus evening flyers.

Most of these are general, however. Let us look at something
more specific: getting leverage on the SAC and TAC Management and
Control System (MCS) data. (Note that we have already touched on
several MCS measures: late takeoffs, cancellations, training loss,
and materiel defects.)

The testing of MCS data is exactly the same as before; in addition,
we will keep pace with progress by using time-sampling techniques.
The intent is to catch any odious trend before it is too late.

MCS scoring comprises two parts: the Base-Line and a
related measure (i.e., the observed frequency, $F_o$):

1.  Manning in Required Specialties:

    Base-Line:  Total Requirement
    $F_o$:      Total Assigned

    (Officers and Airmen kept separate)

2.  Individual Proficiency Training:

    Base-Line:  Total Eligible
    $F_o$:      Total in Training

    Base-Line:  Number Testing
    $F_o$:      Number Passing

etc.

The chi-square test can also be used to monitor for out-of-bounds
conditions. In practice, the monitoring agency (R&A) establishes
a suitable time sample (every week, two weeks, month, etc.)
and determines the number of samples to process (e.g., four
consecutive two-week samples).

For  example,  assume  that  Shop  Reparable  Performance  is  to  be 
monitored.  It  is  decided  that  four  consecutive  two-week  samples 
will  be  used  in  the  computations.  The  Base-Line  is  Items  Processed. 

The $F_o$ is Items Repaired. These are binomially distributed data, so
the nonoccurrence (nonrepaired) counts must be included. The
computations are:

Month     Base-Line    $F_o$     $F_t$      $D^2/F_t$           Means
July         811        721      665.60      25.72               0.889
             941        762      772.29       0.76            (-)0.810
August       763        628      626.20       0.03               0.823
             932        718      764.91      16.04            (-)0.770
Total                  2829     2829.00      $\chi^2$ = 42.55    GM = 0.821
                                             df = 3

Note:  The $D^2/F_t$ column includes occurrences and nonoccurrences.

In the example, the observed chi-square was 42.55, which is well
beyond the .001 significance level for $\chi^2$ with 3 df, and the last
sample in August showed an unfortunate downward trend. Since this is
MCS, it is critical. We want to act.
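
The one-way computation just described can be checked mechanically. What follows is a minimal Python sketch, not part of the original procedure; the function name `binomial_chi_square` and its argument names are illustrative assumptions. It derives each $F_t$ from the grand mean and adds the occurrence and nonoccurrence contributions for every sample, as the note to the table requires.

```python
def binomial_chi_square(base_line, observed):
    """One-way chi-square for binomial counts, nonoccurrences included.

    base_line: items processed in each sample.
    observed:  items repaired in each sample (the F_o).
    Returns (chi_square, df, grand_mean).
    """
    grand_mean = sum(observed) / sum(base_line)
    chi_square = 0.0
    for n, f_o in zip(base_line, observed):
        f_t = n * grand_mean            # theoretical repaired count
        d2 = (f_o - f_t) ** 2           # D^2 is the same for both outcomes
        chi_square += d2 / f_t          # occurrence (repaired) contribution
        chi_square += d2 / (n - f_t)    # nonoccurrence (nonrepaired) contribution
    return chi_square, len(base_line) - 1, grand_mean


# The four two-week samples of the July-August example.
chi_square, df, grand_mean = binomial_chi_square(
    [811, 941, 763, 932], [721, 762, 628, 718]
)
print(round(chi_square, 2), df, round(grand_mean, 3))
```

Because the sketch carries unrounded $F_t$ values, its chi-square may differ from the hand computation in the second decimal place.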

Assume that steps were taken to improve the shop reparable
performance. The method of handling the resulting data is shown below.
The first two weeks' data are dropped, and the new data are added on
the end. The chi-square is recomputed:


Month       Base-Line    $F_o$     $F_t$      $D^2/F_t$           Means
July           941        762      760.25       0.02            (-)0.810
August         763        628      616.44       1.13               0.823
               932        718      752.98       8.46            (-)0.770
September      826        689      667.34       3.66               0.834
Total                    2797     2797.01      $\chi^2$ = 13.27    GM = 0.808
                                               df = 3

The  numbers  have  now  taken  a  trend  back  up.  The  September  data 
show  Shop  Reparable  to  have  a  "batting  average"  of  0.834,  while  the 
over-all  "batting  average"  is  0.808  (the  Grand  Mean). 
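
The drop-the-oldest, add-the-newest bookkeeping can be sketched with a fixed-length queue. This is an illustrative Python sketch under stated assumptions: the helper name `binomial_chi_square` and the use of `collections.deque` are choices made here, not part of the original procedure.

```python
from collections import deque


def binomial_chi_square(samples):
    """Chi-square over (base_line, repaired) pairs, nonoccurrences included."""
    total_n = sum(n for n, _ in samples)
    grand_mean = sum(f_o for _, f_o in samples) / total_n
    chi_square = 0.0
    for n, f_o in samples:
        f_t = n * grand_mean
        d2 = (f_o - f_t) ** 2
        chi_square += d2 / f_t + d2 / (n - f_t)
    return chi_square, grand_mean


# A window of four two-week samples; maxlen=4 makes append() drop the oldest.
window = deque([(811, 721), (941, 762), (763, 628), (932, 718)], maxlen=4)
window.append((826, 689))    # the new September sample
chi_square, grand_mean = binomial_chi_square(window)
print(round(chi_square, 2), round(grand_mean, 3))
```

The same two lines (`append`, then recompute) would be repeated each time a new two-week sample arrives.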

With the August data removed, the numbers show only conventional
fluctuations of random events; hence, reparable performance is in a
steady-state condition. Note that chi-square does not (and cannot)
say whether an over-all batting average of 0.808 is good enough for
the base reparable performance.

The second method of monitoring MCS data has already been implied:
determining the effects of some corrective program. Data are divided
into two categories, "Before" and "After". The question to be asked
is: "Did the new maintenance program significantly improve performance?"

Remember that any data can show a change for the better, but this
change may be only a random fluctuation -- that is, no real change at
all. The DCM needs to know whether the change is significant or only
random variation (as seen in the last example).

The rationale of the previous paragraph follows:

a.  If the upward trend is only a random fluctuation, the
    DCM needs to know this so that further measures can be
    taken to insure that the trend is truly upward.

b.  Or, if the upward trend is only random variation, and
    the cost of the new procedures is great, the DCM may
    wish to go back to the old procedures that gave the
    same results at lower cost. More likely, he may wish
    to revise the procedures again. It is worthwhile to
    repeat the point: the mere movement up or down does
    not necessarily mean either improvement or regression;
    the movement may be due to random fluctuation. The
    basic question is: "Is the upturn outside the limits
    of chance -- has there been a true improvement?"

An example of testing for improvement follows: Inadequate Shop
Reparable performance provoked the establishment of a special set of
procedures to improve performance. The "Before" and "After" data are
tested. The Base-Line is total items processed, and $F_o$ is total items
repaired.


Time      Base-Line    $F_o$      $F_t$      $D^2/F_t$          Means
Before      1695       1346     1366.03       1.51           (-)0.794
After       1752       1432     1411.97       1.46              0.817
Total       3447       2778     2778.00      $\chi^2$ = 2.98    GM = 0.806
                                             df = 1

Here is a good illustration of how chi-square can be helpful.
Apparently the new set of procedures has yielded an improvement: the
number of items repaired is up (1346 to 1432), and the "batting
average" is up (0.79 to 0.82). But, as the chi-square shows, there
is no cause for optimism: the probability is approximately 0.10.
We cannot say with certainty that the new set of procedures has had
any effect. Some re-evaluation of the situation is in order.
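
For a test with 1 df the probability can be computed directly instead of read from a table, since the upper-tail probability of a chi-square value $x$ with 1 df equals $\mathrm{erfc}(\sqrt{x/2})$. The Python sketch below applies this to the "Before" and "After" counts; the function name `before_after_chi_square` is an illustrative assumption, and the shortcut holds for 1 df only.

```python
import math


def before_after_chi_square(before, after):
    """Chi-square (1 df) for two (processed, repaired) pairs.

    Nonoccurrences are included. The p-value uses the 1-df identity
    P(chi-square > x) = erfc(sqrt(x / 2)).
    """
    total_n = before[0] + after[0]
    grand_mean = (before[1] + after[1]) / total_n
    chi_square = 0.0
    for n, f_o in (before, after):
        f_t = n * grand_mean
        d2 = (f_o - f_t) ** 2
        chi_square += d2 / f_t + d2 / (n - f_t)
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return chi_square, p_value


chi_square, p_value = before_after_chi_square((1695, 1346), (1752, 1432))
print(round(chi_square, 2), round(p_value, 3))
```

The probability obtained this way falls between .05 and .10, in line with the "approximately 0.10" read from the table.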

This concludes the discussion of the use of chi-square with the
one-way case. Before going on to the two-way case, we may profitably
repeat certain points:

1.  Numbers vary as part of the normal, random fluctuation.
    Consequently, they can make false impressions. Chi-square
    minimizes this false impression by providing a test for
    homogeneity.

2.  The  chi-square  test  is  easily  computed  and  hence  is  cheap 
to  use.  (Most  of  the  examples  used  in  this  discussion  were 
extracted  from  regular  records.) 

3.  A check should always be made to insure that the sums of
    the $F_o$ and $F_t$ agree within rounding errors, and that the
    $F_t$ are greater than 5.0 in each cell.

4.  If  the  data  are  in  the  form  of  binomial  distributions,  the 
nonoccurrence  items  must  be  included  in  the  computation. 

5.  The  data  must  be  in  the  form  of  frequency  counts. 

6.  Chi-square tests only the numbers offered to it, not the
    reasoning behind the selection of the numbers.

III.  CHI-SQUARE:  THE  TWO-WAY  CASE

The one-way use of the chi-square was characterized by use of a
Base-Line and one set of observations (the $F_o$). The two-way case
uses no Base-Line, and two or more sets of observed frequencies. The
absence of a Base-Line restricts the possible applications, for reasons
to be discussed later. But where it can be used, the two-way case is
an elegant means of testing a broad array of data to find out if
something is getting out of balance.

For illustration, let us test the classic belief that maintenance
troubles come in cycles: "This month, it is Bomb-Nav, last month it
was ..."

As before, a chi-square is computed and checked to see what
probability level it represents. The computational process is slightly
different. The observations (write-ups) are first set up as shown in
Table 5.


Table 5

ORGANIZATION OF WORKSHEET FOR TWO-WAY COMPUTATIONS

                           Write-Ups
System           July    August    September    Row Total
Communication     272      243        231           746
Navigation        208      225        218           651
ECM               286      302        316           904
Bomb-Nav          407      388        411          1206
Autopilot          68       76         55           199
Column Total     1241     1234       1231          3706

In  computing,  we  first  sum  both  the  rows  and  columns  to  get  the 
margin  totals.  As  a  check,  we  get  the  matrix  total  twice,  by  summing 
both  the  row  and  the  column  margin  totals. 

The theoretical frequency ($F_t$) for each cell is computed:

$$F_t = \frac{(\text{Col. Total}) \times (\text{Row Total})}{\text{Matrix Total}};$$

thus the July Bomb-Nav $F_t$ is

$$F_t = \frac{(1241) \times (1206)}{3706} = 403.84.$$

With a desk calculator, a lot of the tedium can be circumvented by
first obtaining (for each column) the constant:

$$k = \frac{\text{Col. Total}}{\text{Matrix Total}},$$

and then multiplying each Row Total by the constant.

The chi-square, as before, is:

$$\chi^2 = \sum \frac{(F_o - F_t)^2}{F_t} = \sum \frac{D^2}{F_t}.$$

In each cell, $F_t$ is subtracted from the corresponding $F_o$; the
difference is squared and divided by $F_t$. As before, these results
are summed to get chi-square.

A complete picture of what has happened can be obtained by
recording three entries in each cell: the $F_o$, the $F_t$, and the
$D^2/F_t$. In addition, when the $\chi^2$ is significantly different from
chance, one should locate those cells with large $D^2/F_t$ and put a
"+" or "-" in front of the $F_o$ that are bigger or smaller than the
$F_t$. This gives us some "hunch material" for use when we start
digging deeper.

Table 6 is the completed computation of the previous data.

The degree-of-freedom computation is:

$$df = (R - 1) \times (C - 1).$$

In Table 6, we have 5 rows and 3 columns; hence:

$$df = (5 - 1) \times (3 - 1) = 8.$$

In this example, the $\chi^2$ of 9.82 is at the probability level:

$$.30 > p > .20.$$


Table 6

COMPLETE COMPUTATION FOR DATA GIVEN IN TABLE 5

System          Entry          July          August       September
Communication   $F_o$         272.00      (-)243.00      (-)231.00
                $F_t$         249.81         248.40         247.79
                $D^2/F_t$       1.97           0.12           1.14
Navigation      $F_o$      (-)208.00         225.00         218.00
                $F_t$         218.00         216.76         216.24
                $D^2/F_t$       0.46           0.31           0.01
ECM             $F_o$      (-)286.00         302.00         316.00
                $F_t$         302.72         301.01         300.28
                $D^2/F_t$       0.92           0.00           0.82
Bomb-Nav        $F_o$         407.00      (-)388.00         411.00
                $F_t$         403.84         401.57         400.59
                $D^2/F_t$       0.02           0.46           0.27
Autopilot       $F_o$          68.00          76.00       (-)55.00
                $F_t$          66.64          66.26          66.10
                $D^2/F_t$       0.03           1.43           1.86

Chi-square = 9.82
df = 8

Hence,  we  conclude  that  the  system  is  showing  the  normal  variation  of 
a  steady-state  condition.  We  would  single  out  no  system  for  special 
attention. 
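
The two-way computation for the July-September write-ups can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the original worksheet: the function name `two_way_chi_square` is hypothetical, and the per-column constant `k` mirrors the desk-calculator shortcut of dividing each column total by the matrix total.

```python
def two_way_chi_square(matrix):
    """Chi-square for a rows-by-columns matrix of observed frequencies."""
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    matrix_total = sum(row_totals)
    # One constant per column: k = (Col. Total) / (Matrix Total).
    k = [c / matrix_total for c in col_totals]
    chi_square = 0.0
    for row, row_total in zip(matrix, row_totals):
        for f_o, k_col in zip(row, k):
            f_t = k_col * row_total           # theoretical frequency
            chi_square += (f_o - f_t) ** 2 / f_t
    df = (len(matrix) - 1) * (len(col_totals) - 1)
    return chi_square, df


# Write-ups for July, August, September (Table 5).
table_5 = [
    [272, 243, 231],   # Communication
    [208, 225, 218],   # Navigation
    [286, 302, 316],   # ECM
    [407, 388, 411],   # Bomb-Nav
    [68, 76, 55],      # Autopilot
]
chi_square, df = two_way_chi_square(table_5)
print(round(chi_square, 2), df)
```

No nonoccurrence term appears here because the write-up counts are plain frequencies without a Base-Line.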

Now assume that the computation has been set as a continuous
monitoring function and that the October results were those of Table 7.

Table 7

COMPUTATIONS FOR AUGUST-OCTOBER DATA

System          Entry         August       September       October
Communication   $F_o$      (-)243.00      (-)231.00         285.00
                $F_t$         248.11         247.50         263.39
                $D^2/F_t$       0.10           1.10           1.77
Navigation      $F_o$         225.00         218.00      (-)212.00
                $F_t$         214.11         213.59         227.30
                $D^2/F_t$       0.55           0.09           1.03
ECM             $F_o$         302.00         316.00      (-)296.00
                $F_t$         298.78         298.05         317.18
                $D^2/F_t$       0.03           1.08           1.41
Bomb-Nav        $F_o$      (-)388.00         411.00      (-)419.00
                $F_t$         398.15         397.18         422.67
                $D^2/F_t$       0.26           0.48           0.03
Autopilot       $F_o$          76.00       (-)55.00          98.00
                $F_t$          74.86          74.68          79.47
                $D^2/F_t$       0.02           5.19           4.32

Chi-square = 17.46
df = 8

The observed $\chi^2$ of 17.46 falls between the .05 and .02 levels
of the tables of the $\chi^2$ distribution for 8 df. It is unlikely that
the distribution of the $F_o$ is due to random variation. Perusal of
the matrix suggests that something is peculiar with the Autopilot
maintenance. It was way down last month, and way up this month. We
have some "hunch material" to pursue.

Note: The test did not show that Autopilot was the culprit.
The test indicated that the entire matrix was not homogeneous. The
only thing we know is that the system (as represented by the five
measures) has slipped out of a steady-state condition. We have a
hunch -- and it is only a hunch -- that the Autopilot data are peculiar.
Hence, we have some means of logically ordering our attention to find
reasons why.
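
Picking out the cells with large $D^2/F_t$ and marking them "+" or "-" can be sketched as a sort. The Python below is illustrative only; the function name `cell_contributions` and the tuple layout are assumptions made here. It is applied to the August-October write-ups.

```python
def cell_contributions(rows, cols, matrix):
    """Return (chi_square, cells), cells sorted by D^2/F_t, largest first.

    Each cell is (row_name, col_name, sign, contribution); sign is "+"
    when F_o exceeds F_t and "-" otherwise.
    """
    row_totals = [sum(r) for r in matrix]
    col_totals = [sum(c) for c in zip(*matrix)]
    total = sum(row_totals)
    cells = []
    for name, row, r_tot in zip(rows, matrix, row_totals):
        for col, f_o, c_tot in zip(cols, row, col_totals):
            f_t = r_tot * c_tot / total
            sign = "+" if f_o > f_t else "-"
            cells.append((name, col, sign, (f_o - f_t) ** 2 / f_t))
    cells.sort(key=lambda cell: cell[3], reverse=True)
    return sum(cell[3] for cell in cells), cells


systems = ["Communication", "Navigation", "ECM", "Bomb-Nav", "Autopilot"]
months = ["August", "September", "October"]
write_ups = [
    [243, 231, 285],
    [225, 218, 212],
    [302, 316, 296],
    [388, 411, 419],
    [76, 55, 98],
]
chi_square, cells = cell_contributions(systems, months, write_ups)
for name, month, sign, contribution in cells[:3]:
    print(name, month, sign, round(contribution, 2))
```

The ranking is only "hunch material": the test itself speaks for the whole matrix, not for any single cell.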

In the two-way case, it is particularly desirable to insure that
$F_t$ in each cell is greater than 5 and preferably greater than 10.
There will be little trouble with AFM 66-1 data, but debriefing data
will give some problems. However, there is a way out: one can
combine rows or columns where such combination will not render the
results meaningless.

For  example,  "notorious  troublemakers"  were  used  in  the  previous 
example.  Had  we  desired  to  include  some  other  systems  (such  as  41000, 
45000,  47000,  49000),  it  probably  would  not  have  been  possible  to  get 
sufficient  Ft  to  compute.  In  this  instance,  all  these  entries  might 
be  combined  into  one  row:  "Miscellaneous  utilities,  hydraulic,  and 
pneumatic  systems." 

Let us sketch a few matrices that might be set up as monitoring
devices. We shall use a three-month time sample to make the discussion
easier, but this time span is not sacred, since the matrix is not
limited to three columns. We can use any number from two on up. Nor
is a month's sample mandatory. It could be lesser or greater (e.g.,
weekly, biweekly, quarterly). But weekly samples of much 66-1 data
are liable to be subject to erratic collection procedures (e.g.,
sometimes the data are in by Friday afternoon and sometimes not).
This will clobber the $\chi^2$, which will be high and erratic. As before,
the nonoccurrence contribution must be included where binomial data
are used.

Some  examples  follow: 


                                    Time Sample
Suggested Categories          July   August   September    Suggested Variables
Basic aircraft
Power plants
Utilities                                                   Write-ups
Instruments and autopilot
Communications                                              Units Produced
Weapon delivery
Etc.

We could also use the type of sortie flown instead of a time
sample to determine if there is a differential effect. For example,
if it is believed that high-altitude sorties put increased demand on
power plants and electronic equipment, our categories could be high,
medium, and low altitude.

Various combinations are testable. For example, there is some
reason to believe that the type of countermeasures combined with the
tracking angle has a differential effect on intercept success. This
could be determined.


                          Track Angle
Countermeasures      Frontal   Beam   Stern    Suggested Variables
Chaff                                          Write-ups
ECM                                            Missed intercepts, ground environment error
None                                           Missed intercepts, material
Both                                           Missed intercepts, flight crew

In  general,  it  will  be  more  desirable  to  use  the  one-way  case 
for  measuring  MCS  data,  such  as  SACR  66-7  and  ADCR  66-28,  thereby 
taking  advantage  of  the  Base-Line  used  by  that  method. 

This completes the discussion of the two-way case. Two cautions
should be repeated:

1.  For  mathematical  reasons,  the  theoretical  cell  frequencies 
should  be  greater  than  5,  and  preferably  greater  than  10. 

The  small  theoretical  frequency  problem  can  occasionally  be 
circumvented  by  combining  categories  (e.g.,  two  months'  data 
instead  of  one,  or  combining  several  similar  categories  into 
an  "others"  or  a  "miscellany"  category). 

2.  The chi-square tests the entire matrix for nonrandom
    variation. The method of isolating large contributors to the
    chi-square is only a method to help infer logical areas to
    explore with a more detailed investigation.

IV.  SIMPLE  ANALYSIS  OF  VARIANCE

In Sec. I the phrases "more detailed investigation," "more
careful study," "more critical testing" appeared frequently. The
following discussion treats one of these methods in detail: analysis
of variance.

The heart of analysis of variance is the famous F-test, in which

$$F = \frac{\text{between-groups variance}}{\text{within-groups variance}}.$$

Having computed F, we enter an F-table. (Unlike chi-square, the
F-test involves entering two numbers (df) instead of one to get our
probability level.) The process is simple, but attempts to explain
the rationale behind it frequently end in confusion. Instead, let us
run through the mechanics of computation.

The  example  is  similar  to  the  one  used  before,  consisting  of  3 
samples:  222  write-ups  produced  by  43  regular  training  sorties,  204 
write-ups  produced  by  39  regular  training  sorties,  and  300  write-ups 
produced  by  38  training  sorties  immediately  after  the  aircraft  came 
out  of  a  hundred-hour  periodic  (see  Table  8).  The  question  is:  "Do 
periodics  cause  additional  work?" 

The tedious part of the computation is getting the sum of the $d^2$
($\sum d^2$). We need the sum of the raw scores ($\sum X$) and the sum of
the raw scores squared ($\sum X^2$):

$$\sum d^2 = \sum X^2 - \frac{(\sum X)^2}{N}.$$

It is convenient to do this by sample, as in Table 9.

Table 8

WRITE-UPS PRODUCED BY TWO SAMPLES OF 43 AND 39 REGULAR
TRAINING SORTIES AND ONE SAMPLE OF 38 TRAINING SORTIES
WITH AN AIRCRAFT JUST OUT OF PERIODIC

Sample 1    Sample 2    Sample 3
   12          13          13
   11           9          12
   10           9          12
   10           8          11
   10           8          11
    8           8          10
    8           8          10
    7           7          10
    7           7          10
    7           7          10
    7           7           9
    7           7           9
    7           7           9
    7           6           9
    7           6           9
    6           6           9
    6           6           8
    6           5           8
    6           5           8
    6           5           8
    5           5           8
    5           5           8
    5           4           8
    5           4           8
    5           4           7
    4           4           7
    4           4           7
    4           4           7
    4           4           7
    3           4           7
    3           3           6
    3           3           6
    3           3           4
    3           2           4
    2           2           3
    2           2           3
    2           1           3
    1           1           2
    1           1
    1
    1
    1
    0

Table 9

ILLUSTRATIONS OF BASIC COMPUTATIONS
FOR GETTING THE $\sum d^2$

Scores (Sample 1)

    $X$      $X^2$
    12       144
    11       121
    10       100
    10       100
    10       100
     8        64
    Etc.
     1         1
     1         1
     1         1
     1         1
     1         1
     0         0

$\sum X$ = 222      $\sum X^2$ = 1516

Once the $\sum X$ and the $\sum X^2$ are computed, 90 percent of the work is
done.

Two techniques are available to reduce the burden:

1.  With a desk calculator, the $\sum X$ and the $\sum X^2$ are obtained in
    one pass by squaring each X using the "accumulating-multiply"
    button. (It took less than 2 minutes to get the complete
    $\sum X$ and $\sum X^2$ for Sample 1, illustrated in Table 9.)

2.  A simple EAM procedure will give $\sum X$, $\sum X^2$, and N. See
    Appendix B.
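
The one-pass accumulation has a direct software analogue. The Python sketch below is an illustration, not the EAM procedure itself; the function name `accumulate` is an assumption. The Sample 1 scores are those of Table 8, written as value-times-count for brevity.

```python
def accumulate(scores):
    """Return (sum of X, sum of X squared, N) in a single pass."""
    sum_x = sum_x2 = n = 0
    for x in scores:
        sum_x += x
        sum_x2 += x * x
        n += 1
    return sum_x, sum_x2, n


# Sample 1 of Table 8: write-ups per sortie for 43 regular training sorties.
sample_1 = ([12, 11] + [10] * 3 + [8] * 2 + [7] * 8 + [6] * 5 + [5] * 5
            + [4] * 4 + [3] * 5 + [2] * 3 + [1] * 5 + [0])
print(accumulate(sample_1))   # (222, 1516, 43)
```

Keeping the three accumulations per sample, rather than the raw scores, is all the later analysis-of-variance steps require.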

In a similar manner, we get $\sum X$ and $\sum X^2$ for each of the samples:


              Sample 1    Sample 2    Sample 3    Total
$\sum X$         222         204         300        726
$\sum X^2$      1516        1320        2626       5462
N                 43          39          38        120

The basic computations follow.

We are trying to get two numbers: (1) $\sum d^2$ between samples; and
(2) $\sum d^2$ within samples. The easiest way to get these two numbers is
to first get the total $\sum d^2$ and then subtract the between $\sum d^2$. The
process follows:

To get the total $\sum d^2$, add up the $\sum X$ and $\sum X^2$ obtained (Total
column -- see preceding tabulation) and apply the formula

$$\sum d^2 = \sum X^2 - \frac{(\sum X)^2}{N}$$

$$\text{Total } \sum d^2 = 5462 - \frac{(726)^2}{120} = 5462 - 4392.3 = 1069.70.$$

The between $\sum d^2$ is somewhat easier. The individual $\sum X$'s and
n's are used:

$$\text{Between } \sum d^2 = \frac{(222)^2}{43} + \frac{(204)^2}{39} + \frac{(300)^2}{38} - \frac{(726)^2}{120} = 189.34.$$

Note that the correction term $(726)^2/120$ is the same in both equations.

The within $\sum d^2$ is obtained by subtraction:

$$\text{Within } \sum d^2 = \text{total } \sum d^2 - \text{between } \sum d^2 = 1069.70 - 189.34 = 880.36.$$

The last needs are the degrees of freedom. As before, df is
always something minus one. In this instance:

    Total df   = 120 - 1 = 119   (120 measures)
    Between df = 3 - 1   = 2     (3 samples)
    Within df  = 119 - 2 = 117   (total df - between df)

Or one can derive the within df directly (it is a good check)
by going directly to the sample n's:

    43 - 1 = 42
    39 - 1 = 38
    38 - 1 = 37
    Total  = 117 = within df

Table 10 has all the information laid out in conventional format.
(Although the total entries are not used, they are included to give
the complete story.)

Table 10

COMPLETE ANALYSIS-OF-VARIANCE SUMMARY

Source of       Sum of               Mean
Variation       Squares      df      Square
Between          189.34       2       94.67
Within           880.36     117        7.52
Total           1069.70     119

The mean square is obtained by dividing each $\sum d^2$ by its df
(189.34/2 = 94.67 and 880.36/117 = 7.52). F = 94.67/7.52 = 12.58.
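
The whole summary table can be produced from the three accumulations per sample. The following Python sketch is illustrative; the function name `anova_from_sums` and the dictionary keys are assumptions made here, while the arithmetic follows the total, between, and within steps described in the text.

```python
def anova_from_sums(samples):
    """One-way analysis of variance from per-sample (sum_x, sum_x2, n)."""
    total_x = sum(s[0] for s in samples)
    total_x2 = sum(s[1] for s in samples)
    total_n = sum(s[2] for s in samples)
    correction = total_x ** 2 / total_n          # (sum X)^2 / N
    total_ss = total_x2 - correction
    between_ss = sum(sx ** 2 / n for sx, _, n in samples) - correction
    within_ss = total_ss - between_ss
    between_df = len(samples) - 1
    within_df = total_n - len(samples)
    return {
        "between_ss": between_ss,
        "within_ss": within_ss,
        "total_ss": total_ss,
        "between_df": between_df,
        "within_df": within_df,
        "f": (between_ss / between_df) / (within_ss / within_df),
    }


# (sum X, sum X^2, N) for Samples 1, 2, and 3.
result = anova_from_sums([(222, 1516, 43), (204, 1320, 39), (300, 2626, 38)])
print({key: round(value, 2) for key, value in result.items()})
```

The F returned here is computed from unrounded mean squares, which is why it agrees with 12.58 rather than with the quotient of the two rounded table entries.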

Reading the table of the F distribution (Appendix B) is slightly
different from reading a table of chi-square. In reading F-tables
one must find the cell represented by the two df's. (In this example,
between df = 2 and within df = 117, so we go to the second column and
117th row. There is no row 117, so we take row 125, the next best
thing.) The cell in the F-table shows two entries: one for the .05
level (the smaller number) and one for the .01 level of probability.
We check to see if our F is bigger than either of these.

We might digress for a moment to explain a confusing element in
the way F-tables are conventionally laid out. The common tables have
a note, "degrees of freedom (for greater mean square)" over the columns.
Strike the word "greater" and insert the word "between". You must use
rows to find within, and columns to find between. You will quickly
discover that if the computed F is less than 1.0, there is no point in
bothering to look it up.

In our example, F = 12.58 and the 2 x 125 cell contained 3.07
and 4.78. Our F is bigger than 4.78 (the .01 level), so we say: the
odds are more than 100 to 1 against our getting an F this large by
chance. We reject the hypothesis that the differences among the samples
are due to random variation. If our F had been smaller than 3.07 (the
.05 level), we would have said that there are more than 5 chances in
100 of getting an F this big just from random variation. Therefore,
we could not say that the numbers have any significance to us -- that
is, random variation could account for any of the differences. (Or,
to use previous terms, the numbers are the same as we might have
obtained by rolling the same set of dice.)

But the F-test of the sample was high (p < .01). Something,
apparently, has happened to cause the numbers to vary as they did.
The numbers suddenly become interesting. Let us compute the means.
We already have the numbers:

    Mean = $\sum X / N$

    Regular Flyer 1 ... 222/43 = 5.16
    Regular Flyer 2 ... 204/39 = 5.23
    Post-periodic ..... 300/38 = 7.89

The Post-periodic mean is obviously not what it should be. It
is costing us an average of 2.7 additional write-ups per sortie. We
will want to check the nature of the write-ups to see if preventive
measures are possible.

This is, perhaps, too much discussion and not enough figuring.
Let us take another critical problem. Records are kept of drops made
by five different bomb-nav systems, measured in "yards from the shack."
The question is: "Have one or more systems 'gone sick' or is the
variation among the five systems random?"

The data and computations follow:


                Systems
 (1)    (2)    (3)    (4)    (5)
  89     07     38     27     97
  04     86     31     32     98
  98     83     91     66     51
  41     08     38     11     40
  28     41     02     17     26
  65     00     11     62     63
  05     86     48     62     63
  39     33            75     36
  64     61            87     93
  65     43                   29
         29                   97
                              85
                              75

NOTE: The first drop for System 1 was 89 yards off, the second
04 yards off, etc.

                                System
                1         2         3         4         5        Total
$\sum X$       498       477       259       439       818      $\sum\sum X$ = 2,491
$\sum X^2$  34,098    30,975    14,559    27,401    61,688      $\sum\sum X^2$ = 168,721
N               10        11         7         9        13      N = 50

NOTE: The expression $\sum\sum$ means "the sum of the sums."


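$$\text{Total sum of squares} = \sum\sum X^2 - \frac{(\sum\sum X)^2}{N} = 168{,}721 - \frac{(2{,}491)^2}{50} = 44{,}619.38.$$

$$\text{df total} = N - 1 = 50 - 1 = 49.$$

$$\text{Between sum of squares} = \frac{(\sum X_1)^2}{n_1} + \frac{(\sum X_2)^2}{n_2} + \cdots - \frac{(\sum\sum X)^2}{N} = \frac{(498)^2}{10} + \frac{(477)^2}{11} + \cdots - \frac{(2{,}491)^2}{50} = 3{,}850.76.$$

$$\text{df between} = \text{number of samples} - 1 = 5 - 1 = 4.$$

$$\text{Within sum of squares} = \text{total sum of squares} - \text{between sum of squares} = 44{,}619.38 - 3{,}850.76 = 40{,}768.62.$$

$$\text{df within} = \text{total} - \text{between} = 49 - 4 = 45.$$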


Source of           Sum of                 Mean
Variation           Squares       df       Square       F
Between Groups      3,850.76       4       962.69      1.06
Within Groups      40,768.62      45       905.97
Total              44,619.38      49

Checking the F-table with 4 and 45 df, we find the level of
significance does not come close to the .05 level. This means that
the between-systems deviation is the same variation we would get if
we had drawn all the samples from a table of random numbers, which
these were. And our Commander of the armament and electronics (A and
E) squadron can relax with the assurance that none of the systems has
gone sour.

One of the advantages of analysis of variance is nicely
illustrated. As with chi-square, one can test a lot of samples
simultaneously; we could easily have tested 10, 20, or 30 systems,
without much increase in effort. (Had we tested each sample against
each other sample by the conventional mean-difference methods, we
would have had to compute 10 separate tests of mean differences, and
still would not have the "big picture".) But with analysis of variance
we do the testing in one fell swoop. The shortcoming of analysis of
variance (like chi-square) is that the entire set of measures is
tested; thus, we cannot (with analysis of variance alone) isolate
the specific offenders. However, as we shall show, there is a way
of solving this problem.

The analysis of variance is admirably suited for monitoring
trends, if the suggested way of computing (i.e., by samples) is
followed.

Assume that we are interested in monitoring the number of training
items lost due to equipment malfunction. For each training sortie
we record the number of items lost, and at the end of each week the
sorties are counted (N), and the number of items lost in each sortie
is simultaneously summed ($\sum X$) and squared and summed ($\sum X^2$). We
also compute the mean. The accumulations appear thus:


                     Weeks
               1       2       3       4
$\sum X$       11       8      12      12
$\sum X^2$     15      10      22      24
N              16      16      15      14
Mean         0.68    0.50    0.80    0.86

In practice, the data above are monitored. Any time the trend
seems to be taking a turn for the worse (as in Week 4), the analysis
of variance is computed to determine if the variation is out of bounds.

Source of         Sum of               Mean
Variation         Squares      df      Square       F
Between weeks       1.14        3       0.38       0.55
Within weeks       39.55       57       0.69
Total              40.69       60

Entering the F-tables with 3 and 57 df, we find that the F does
not reach the 0.05 level; hence, we infer that the variation shown is
random.
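
A monitoring routine of this kind only needs to store three numbers per week. The Python sketch below is illustrative; the class name `WeeklyMonitor` and its methods are hypothetical, chosen here to show how the weekly accumulations can be kept and the F-test run on demand.

```python
class WeeklyMonitor:
    """Keeps per-week (sum_x, sum_x2, n) and runs the F-test on demand."""

    def __init__(self):
        self.weeks = []

    def add_week(self, sum_x, sum_x2, n):
        """Record one week's accumulations."""
        self.weeks.append((sum_x, sum_x2, n))

    def f_test(self):
        """Return (F, between df, within df) for all recorded weeks."""
        total_x = sum(w[0] for w in self.weeks)
        total_x2 = sum(w[1] for w in self.weeks)
        total_n = sum(w[2] for w in self.weeks)
        correction = total_x ** 2 / total_n
        between_ss = sum(sx ** 2 / n for sx, _, n in self.weeks) - correction
        within_ss = (total_x2 - correction) - between_ss
        between_df = len(self.weeks) - 1
        within_df = total_n - len(self.weeks)
        f_ratio = (between_ss / between_df) / (within_ss / within_df)
        return f_ratio, between_df, within_df


monitor = WeeklyMonitor()
for week in [(11, 15, 16), (8, 10, 16), (12, 22, 15), (12, 24, 14)]:
    monitor.add_week(*week)
f_ratio, between_df, within_df = monitor.f_test()
print(round(f_ratio, 2), between_df, within_df)
```

An F below 1.0, as here, need not be looked up at all.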

It  is  possible  to  draw  a  picture  of  what  the  analysis  of  variance 
has  shown  us  (i.e.,  whether  or  not  the  sample  means  are  outside  the 
limits  of  the  mean  of  the  entire  set  of  samples). 

To do this we need the grand mean of the set of samples and the
standard error of the grand mean ($\sigma_m$). This is painless work since
we have already done the drudgery. The problem used before
will serve.

The  totals  obtained  were  : 

E  X  -  726 
E  X2-  5462 

N  -  120  2 

E  d2-  1069.70  (i.e.,  5462  -  ). 

The  grand  mean  is  : 

726/120  =  6.05. 

* 

We  now  need  to  find  the  standard  error  of  the  mean: 

J77~ 

°m  V  N(N-l)  ' 

Hence  the  standard  error  of  the  mean  is  : 


m 


1069.70 
(120) (119) 


0.27. 


The  range  included  by  the  mean  1  2  o  includes  95  percent  of  the 

m 

grand  means  we  would  get  if  we  repeated  the  sampling  hundreds  of  times. 


*The formula for the standard error of the mean can be found in
any statistics textbook, which will also show that the sample mean
has a normal or bell-shaped distribution, with approximately 95 percent
of the sample means falling within two standard errors ($\pm 2\sigma_m$)
of the true mean.


In this case,

$2\sigma_m$ = (2)(0.27) = 0.54
lower limit = 6.05 - 0.54 = 5.51
upper limit = 6.05 + 0.54 = 6.59.

Of the sample means, 95 percent would fall between 5.51 and 6.59,
or 5 percent would fall outside the range 5.51 to 6.59. This is the
same 5-percent level (p = 0.05) we have used before.
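
The arithmetic for the limits can be checked with a few lines of Python. This is only a sketch under the formulas given above; the function name `grand_mean_limits` is made up for illustration.

```python
import math


def grand_mean_limits(sum_x, sum_x2, n):
    """Grand mean, standard error of the mean, and the two-sigma limits."""
    grand_mean = sum_x / n
    # Sum of squared deviations: sum of X squared, less (sum of X) squared over N.
    sum_d2 = sum_x2 - sum_x ** 2 / n
    std_error = math.sqrt(sum_d2 / (n * (n - 1)))
    lower = grand_mean - 2 * std_error
    upper = grand_mean + 2 * std_error
    return grand_mean, std_error, lower, upper


# Totals from the problem used in the text.
gm, se, lower, upper = grand_mean_limits(726, 5462, 120)
print(f"GM = {gm:.2f}, standard error = {se:.2f}")
print(f"limits = {lower:.2f} to {upper:.2f}")
```

Because the sketch does not round the standard error to 0.27 before doubling it, its limits can differ from the hand-computed 5.51 and 6.59 by about 0.01.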

We can now make a picture of our results by plotting the means
and the $\pm 2\sigma_m$ limits. (The horizontal axis has no consequence.)


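[Figure: sample means of the Post-Alert Flyers and the Regular Flyers
plotted on a vertical scale running from 5 to 8, with the grand mean
(GM = 6.05) and the $+2\sigma_m$ limit (6.59) and $-2\sigma_m$ limit marked.]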
Note that the grand mean (GM) provides a Base-Line. The critical
factor is the distance from this Base-Line, while the $\pm 2\sigma_m$ limits
provide meaning to this distance. In the example, we can see that the
samples of regular flyers are considerably less than 1/2 $\sigma_m$ apart,
and both are several $\sigma_m$ from the post-alert flyers.

The F-test told us there is little likelihood that all three
samples were drawn from the same population. Plotting the data in
terms of GM and $\sigma_m$ gives us an idea of what has happened. We feel
fairly sure that the two regular-flyer samples come from one population
and that the post-alert flyers come from another. We now are
fairly confident that a long stay in home alert does have a degrading
effect on the aircraft.
Appendix A

COMPUTATIONAL  TECHNIQUES 

The most tedious part of any statistical computation is getting
$\Sigma X$ (sum of the raw scores) and $\Sigma X^2$ (sum of each raw score squared).
Once these are obtained, the remaining calculations rarely take more
than 15 minutes per problem.

The first suggestion is: never copy data. Record them in the
form in which they will be used. In the case of AFM 66-1 and SACM
66-7 data, such a form is provided. For special reports, we suggest
using AF Form 1530 (the 80-column key-punch form) for worksheets.
Then, if the computations become too frequent for hand work, the sheets
can be key-punched as a preliminary to taking advantage of the Base
PCAM.

There are three basic rules:

1. Never do a calculation by hand if you have a calculator
(or slide rule).

2. Never do it by calculator if you have a table of squares.

3. Never follow either of the two previous rules if PCAM is
available. Generally speaking, PCAM can add and subtract
readily, but can multiply and divide only with extreme
anguish.

SINGLE-SAMPLE  CHI-SQUARE 

If  the  data  have  been  key-punched,  obtain  a  listing  like  the 
following  (have  the  PCAM  group  sum  the  columns)  : 


Base-Line (BL) Data        $F_o$

        8                    7
        4                    4
        6                    5
      etc.                 etc.

  $\Sigma$ = 128           $\Sigma$ = 95

The computation of $F_t$ is:

Theoretical frequency = ($BL_i$ / $\Sigma BL$) x $\Sigma$ observed frequencies; e.g.,

$F_t$ = (8/128) x (95).
Since 95/128 = 0.742 is a constant, lock 0.742 in the keyboard, then
multiply each Base-Line entry by the constant.

Only two accuracy checks are possible. $\Sigma F_o$ must equal $\Sigma F_t$
(within rounding error) or the use of the chi-square will be invalidated.
And the $\Sigma D$ (differences between $F_o$ and $F_t$) will equal zero
(within rounding error).

Because the $D^2$ may become large, it is tempting to divide $D^2$ by
a constant. If you do, do not forget to multiply it by the same constant
afterward; otherwise the resulting chi-square will be shrunk
to insignificance. Some people like to use the $10^n$ approach:

3216 = 3.216 x $10^3$
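
The single-sample steps can be sketched in Python as follows. This is an illustration, not the keyboard procedure itself: the function name is made up, and chi-square is taken in its usual form, the sum of $D^2 / F_t$. Only the three rows shown in the listing are used, so the sums here are 18 and 16 rather than the full-listing totals of 128 and 95.

```python
def single_sample_chi_square(baseline, observed, tolerance=1e-6):
    """Theoretical frequencies, accuracy checks, and chi-square."""
    # The constant that would be locked in the keyboard.
    constant = sum(observed) / sum(baseline)
    theoretical = [b * constant for b in baseline]
    differences = [o - t for o, t in zip(observed, theoretical)]

    # Accuracy check 1: the theoretical and observed totals must agree.
    if abs(sum(theoretical) - sum(observed)) > tolerance:
        raise ValueError("sum of F_t does not equal sum of F_o")
    # Accuracy check 2: the differences must sum to zero.
    if abs(sum(differences)) > tolerance:
        raise ValueError("sum of D is not zero")

    # Assumes no Base-Line entry is zero (that would divide by zero).
    chi_square = sum(d * d / t for d, t in zip(differences, theoretical))
    return constant, theoretical, chi_square


# Only the first three rows of the listing; the "etc." rows are not given.
constant, theoretical, chi_square = single_sample_chi_square([8, 4, 6], [7, 4, 5])
print(f"constant = {constant:.3f}, chi-square = {chi_square:.4f}")
```

Working in floating point avoids the need to scale $D^2$ by a constant, which is a convenience of the machine rather than a change in method.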


MULTI-SAMPLE  CHI-SQUARE 

If  the  data  matrices  are  large  (more  than  50  cells)  or  if  there 
are  many  of  them,  one  should  think  about  using  PCAM  for  assistance. 
Without strain, PCAM can give both the sums of the rows (they call it
cross-footing) and the sums of the columns (accumulating) with the
grand total thrown in free.

Again  the  use  of  AF  Form  1530  is  urged.  When  you  have  many 
different  kinds  of  samples  to  run,  try  to  use  the  same  columns  on  the 
Form  1530's.  This  way  extensive  rewiring  of  boards  is  avoided. 
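
For readers with a general-purpose computer instead of PCAM, the row sums, column sums, and grand total can be sketched in a few lines of Python. The function name `foot_matrix` and the cell values are hypothetical, chosen only to show the layout.

```python
def foot_matrix(matrix):
    """Row sums (cross-footing), column sums (accumulating), grand total."""
    row_sums = [sum(row) for row in matrix]
    # zip(*matrix) walks the matrix column by column; it stops at the
    # shortest row, so a row with a missing cell shows up in the check below.
    column_sums = [sum(column) for column in zip(*matrix)]
    grand_total = sum(row_sums)
    if grand_total != sum(column_sums):
        raise ValueError("row and column totals disagree; check for missing cells")
    return row_sums, column_sums, grand_total


# Hypothetical 2 x 3 data matrix, not taken from the text.
cells = [[12, 7, 9],
         [5, 11, 6]]
row_sums, column_sums, grand_total = foot_matrix(cells)
print(row_sums, column_sums, grand_total)
```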

ANALYSIS  OF  VARIANCE 

When the data come in strung out over time, you can avoid a flap
by computing as you go along--that is, computing the $\Sigma X$ and $\Sigma X^2$ at
the end of each day. Then little work remains by the end of the week.
Also, it is easier to check calculation accuracy when a large sample
is broken into several small parts. (If you can get a printing calculator,
accuracy checks are painless.)
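
The compute-as-you-go idea amounts to keeping three running totals. Here is a minimal sketch in Python; the class name `RunningSums` and the daily counts are hypothetical, chosen only to show the bookkeeping.

```python
class RunningSums:
    """Keeps N, the sum of X, and the sum of X squared as data arrive."""

    def __init__(self):
        self.n = 0
        self.sum_x = 0
        self.sum_x2 = 0

    def add_day(self, scores):
        """Fold one day's scores into the running totals."""
        self.n += len(scores)
        self.sum_x += sum(scores)
        self.sum_x2 += sum(x * x for x in scores)

    def mean(self):
        return self.sum_x / self.n


week = RunningSums()
# Hypothetical daily scores, one list per day.
for day in ([1, 0, 2], [0, 3], [1, 1, 0]):
    week.add_day(day)
print(week.n, week.sum_x, week.sum_x2, week.mean())
```

Each day's subtotals can be checked on their own before they are added in, which is the accuracy advantage the text describes.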

The  use  of  PCAM  can  be  a  big  aid,  but  it  takes  a  little  prior 
planning  and  discussion  with  the  PCAM  people.  There  are  several 
solutions.  The  one  shown  is  not  necessarily  the  best,  but  it  is  the 
easiest  to  understand. 
Lay out the samples on the 1530 Form:

Sample 1     Sample 2     Sample 3

   24           47           94
   69           35           31
   85           63          102
  etc.         etc.         etc.

Then have key-punch make up a deck of $X^2$ cards:

            X     $X^2$
Card 1      1       1
Card 2      2       4
Card 3      3       9
Card 4      4      16
etc.


Sample 1 of the data cards is sorted in the same order as the
$X^2$ cards. The cards are merged (on the collator) and the tab is
wired to print both X and $X^2$ for each X in the data. The card count
is the N. (If zero is a significant measure--such as "no malfunctions"
--include a zero card in the $X^2$ deck to make the card-count correct.)

The data cards are then re-sorted on Sample 2 and the process
repeated. If the data samples are on separate cards, you can also
get minor and intermediate totals, along with card counts--in short,
the complete information given in the tables used to illustrate the
method.
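
The sort-merge-print procedure is in effect a table lookup, and it can be mimicked in Python. This is a sketch only: the function names are made up, the deck is modeled as a dictionary, and just the three Sample 1 scores shown on the form are used (the "etc." rows are not given).

```python
def build_square_deck(largest_x):
    """Deck of X-squared cards: one entry per X from 0 up to largest_x.

    The zero entry plays the part of the zero card mentioned in the text.
    """
    return {x: x * x for x in range(largest_x + 1)}


def tabulate_sample(sample, deck):
    """Sort the data, look up each square, and total the listing."""
    listing = [(x, deck[x]) for x in sorted(sample)]
    n = len(listing)  # the card count
    sum_x = sum(x for x, _ in listing)
    sum_x2 = sum(square for _, square in listing)
    return listing, n, sum_x, sum_x2


deck = build_square_deck(102)
listing, n, sum_x, sum_x2 = tabulate_sample([24, 69, 85], deck)
for x, square in listing:
    print(f"{x:5d} {square:7d}")
print(f"N = {n}, sum of X = {sum_x}, sum of X squared = {sum_x2}")
```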


Appendix  B 

SOME  CRITICAL  VALUES  OF  F  AND  CHI-SQUARE. 


Table  12 

VALUES  OF  F  AT  THE  FIVE  PER  CENT  (LIGHTFACE  TYPE)  AND  ONE  PER  CENT  (BOLDFACE  TYPE)  LEVELS  OF  SIGNIFICANCE 